Protein Structure Prediction: Selecting Salient Features from Large Candidate Pools
Abstract
We introduce a parallel approach, "DT-Select," for selecting features used by inductive learning algorithms to predict protein secondary structure. DT-Select is able to rapidly choose small, nonredundant feature sets from pools containing hundreds of thousands of potentially useful features. It does this by building a decision tree, using features from the pool, that classifies a set of training examples. The features included in the tree provide a compact description of the training data and are thus suitable for use as inputs to other inductive learning algorithms. Empirical experiments in the protein secondary-structure task, in which sets of complex features chosen by DT-Select are used to augment a standard artificial neural network representation, yield surprisingly little performance gain, even though features are selected from very large feature pools. We discuss some possible reasons for this result.
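The idea of selecting features by growing a decision tree and keeping only the features that appear in it can be illustrated with a minimal, sequential sketch. This is not the authors' implementation: the abstract does not state the splitting criterion, so information gain (ID3-style) is assumed here, the parallel aspect is omitted, and the feature names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, labels, feature):
    """Entropy reduction from splitting on one boolean feature (assumed criterion)."""
    yes = [y for x, y in zip(examples, labels) if x[feature]]
    no = [y for x, y in zip(examples, labels) if not x[feature]]
    if not yes or not no:
        return 0.0  # the feature does not separate these examples at all
    remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
    return entropy(labels) - remainder


def select_features(examples, labels, pool, selected=None):
    """Grow a tree greedily and return the features used at its internal nodes."""
    if selected is None:
        selected = []
    if len(set(labels)) <= 1:
        return selected  # pure (or empty) node: nothing left to split
    # Ties are broken by pool order, because max() keeps the first maximum.
    best = max(pool, key=lambda f: information_gain(examples, labels, f))
    if information_gain(examples, labels, best) <= 0.0:
        return selected  # no feature in the pool helps at this node
    if best not in selected:
        selected.append(best)
    for branch in (True, False):
        xs = [x for x in examples if bool(x[best]) == branch]
        ys = [y for x, y in zip(examples, labels) if bool(x[best]) == branch]
        select_features(xs, ys, pool, selected)
    return selected


# Hypothetical toy pool: one redundant copy and one irrelevant feature.
POOL = ["A_at_0", "copy_of_A_at_0", "G_at_+1", "noise"]
ROWS = [  # (A_at_0, copy_of_A_at_0, G_at_+1, noise, label)
    (True, True, False, True, "helix"),
    (True, True, False, False, "helix"),
    (False, False, True, True, "coil"),
    (False, False, True, False, "coil"),
    (False, False, False, True, "strand"),
    (False, False, False, False, "strand"),
]
EXAMPLES = [dict(zip(POOL, row[:4])) for row in ROWS]
LABELS = [row[4] for row in ROWS]

print(select_features(EXAMPLES, LABELS, POOL))  # ['A_at_0', 'G_at_+1']
```

In this toy run the redundant copy is never chosen, because once the tree has split on `A_at_0` the copy carries no further information, and the irrelevant feature is never chosen because it has zero gain at every node. The returned list is the kind of compact feature set that could then be passed to another learner.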
Similar Resources
Protein Secondary Structure Prediction: A Literature Review with Focus on Machine Learning Approaches
A DNA sequence, which contains all genetic traits, is not itself a functional entity. Instead, it is converted into protein sequences through transcription and translation. The protein sequence later takes on a 3D structure, which is the functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
A Contact-Assisted Approach to Protein Structure Prediction and Its Assessment in CASP10
Among different approaches to predicting the 3D structure of a protein, one important idea is to predict a protein residue-residue contact map and then construct a full 3D structure from the contact map. Instead of building a structure purely from contact information, here we describe a contact-assisted structure prediction approach that uses only a few known contacts to improve the quality of alre...
In Silico Prediction and Docking of Tertiary Structure of Multifunctional Protein X of Hepatitis B Virus
Hepatitis B virus (HBV) infection is a universal health problem and may result in acute, fulminant, or chronic hepatitis, liver cirrhosis, or hepatocellular carcinoma. The sequence for protein X of HBV was retrieved from the UniProt database. ProtParam from the ExPASy server was used to investigate the physicochemical properties of the protein. Homology modeling was carried out using the Phyre2 server, and refin...